SIFT-based local spectrogram image descriptor: a novel feature for robust music identification

Authors

  • Xiu Zhang
  • Bilei Zhu
  • Linwei Li
  • Wei Li
  • Xiaoqiang Li
  • Wei Wang
  • Peizhong Lu
  • Wenqiang Zhang
Abstract

Music identification via audio fingerprinting has been an active research field in recent years. In real-world environments, music queries are often deformed by various interferences, which typically include signal distortions and time-frequency misalignments caused by time stretching, pitch shifting, etc. Robustness therefore plays a crucial role in any music identification technique. In this paper, we propose to use scale invariant feature transform (SIFT) local descriptors computed from a spectrogram image as sub-fingerprints for music identification. Experiments show that these sub-fingerprints exhibit strong robustness against serious time stretching and pitch shifting simultaneously. In addition, a locality sensitive hashing (LSH)-based nearest sub-fingerprint retrieval method and a matching determination mechanism are applied for robust sub-fingerprint matching, which makes the identification efficient and precise. Finally, as an auxiliary function, we demonstrate that by comparing the time-frequency locations of corresponding SIFT keypoints, the time-stretching and pitch-shifting factors that music queries might have experienced can be accurately estimated.
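The last point of the abstract, estimating the distortion factors from the locations of matched keypoints, can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `estimate_factors`, the use of medians, the `bins_per_octave` parameter, and the assumption that the spectrogram has a logarithmic frequency axis are all hypothetical choices made here for illustration. Under those assumptions, time stretching scales the time gaps between keypoints, and pitch shifting moves every keypoint by a constant number of frequency bins.

```python
from itertools import combinations
from statistics import median


def estimate_factors(ref_pts, qry_pts, bins_per_octave=12):
    """Estimate (time_stretch, pitch_factor) from matched keypoints.

    ref_pts, qry_pts: equal-length lists of (time_frame, log_freq_bin)
    tuples, where ref_pts[k] and qry_pts[k] are a matched pair.
    Assumption (not from the paper): the frequency axis is logarithmic
    with `bins_per_octave` bins per octave.
    """
    if len(ref_pts) != len(qry_pts) or len(ref_pts) < 2:
        raise ValueError("need at least two matched keypoint pairs")

    # Time stretching scales the gap between any two keypoints, so the
    # ratio of query gap to reference gap estimates the stretch factor.
    ratios = []
    for i, j in combinations(range(len(ref_pts)), 2):
        dt_ref = ref_pts[j][0] - ref_pts[i][0]
        dt_qry = qry_pts[j][0] - qry_pts[i][0]
        if dt_ref != 0:
            ratios.append(dt_qry / dt_ref)
    if not ratios:
        raise ValueError("all reference keypoints share one time frame")
    stretch = median(ratios)  # the median tolerates a few wrong matches

    # On a log-frequency axis, a pitch shift is a constant bin offset.
    bin_shift = median(q[1] - r[1] for r, q in zip(ref_pts, qry_pts))
    pitch_factor = 2.0 ** (bin_shift / bins_per_octave)
    return stretch, pitch_factor


if __name__ == "__main__":
    ref = [(0, 10), (10, 20), (30, 15), (50, 40)]
    # Synthetic query: time scaled by 1.2 and offset by 5 frames,
    # frequency raised by 2 bins (two semitones at 12 bins per octave).
    qry = [(5, 12), (17, 22), (41, 17), (65, 42)]
    print(estimate_factors(ref, qry))
```

The median is used instead of the mean so that a minority of incorrect keypoint matches does not dominate the estimate; a real system would likely filter matches first.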


Similar resources

DPML-Risk: An Efficient Algorithm for Image Registration

Registration and tracking of targets and objects in a sequence of images play an important role in various areas. One method of image registration is the feature-based algorithm, which is accomplished in two steps. The first step includes finding features of the sensed and reference images. In this step, a scale space is used to reduce the sensitivity of detected features to scale changes. Afterw...


Local gradient pattern - A novel feature representation for facial expression recognition

Many researchers adopt Local Binary Pattern for pattern analysis. However, the long histogram created by Local Binary Pattern is not suitable for large-scale facial databases. This paper presents a simple facial pattern descriptor for facial expression recognition. The local pattern is computed based on local gradient flow from one side to the other through the center pixel in a 3x3 pixel region...


Local Image Descriptor using VQ-SIFT for Image Retrieval

In this paper, we present local image descriptor using VQ-SIFT for more effective and efficient image retrieval. Instead of SIFT's weighted orientation histograms, we apply vector quantization (VQ) histogram as an alternate representation for SIFT features. Experimental results show that SIFT features using VQ-based local descriptors can achieve better image retrieval accuracy than the conventi...


A novel Local feature descriptor using the Mercator projection for 3D object recognition

Point cloud processing is a rapidly growing research area of computer vision. The introduction of cheap range sensors has generated great interest in point cloud processing and 3D object recognition. 3D object recognition methods can be divided into two categories: global and local feature-based methods. Global features describe the entire model shape whereas local features encode the neighborhood ...


Efficient music identification using ORB descriptors of the spectrogram image

Audio fingerprinting has been an active research field typically used for music identification. Robust audio fingerprinting technology is used to successfully perform content-based audio identification regardless of the audio signal being subjected to various types of distortion. These distortions affect the time-frequency correlation relating to pitch and speed changes. In this paper, experime...



Journal:
  • EURASIP J. Audio, Speech and Music Processing

Volume 2015

Publication date 2015